University of Hagen at CLEF 2008: Answer Validation Exercise
Author
Abstract
RAVE (Real-time Answer Validation Engine) is a logic-based answer validator/selector designed for application in real-time question answering. RAVE uses the same toolchain for deep linguistic analysis and the same background knowledge as its predecessor MAVE, which took part in AVE 2007. However, a full logical answer check as in MAVE was not considered suitable for real-time answer validation, since it requires parsing all answer candidates. RAVE therefore uses a simplified validation model in which the prover only checks whether the support passage contains a correct answer at all. This move from logic-based answer validation to logical validation of supporting snippets lets RAVE avoid parsing answers entirely: the system only needs a parse of the question and pre-computed snippet analyses. In this way very low validation/selection times can be achieved.

Machine learning is used to assign local validation scores from both logic-based and shallow features. The resulting local validation scores are then improved by aggregation. One of the key features of RAVE is its aggregation model, which is robust against duplicated information in the support passages. In this model, the effect of aggregation is controlled by the lexical diversity of the support passages for a given answer. If the support passages have no terms in common, aggregation has maximal effect and the passages are treated as providing independent evidence. Repetition of a support passage, by contrast, has no effect on the aggregation result at all. To obtain a richer basis for aggregation, an active validation approach was chosen: the original pool of support passages in the AVE 2008 test set was enriched by retrieving additional support passages from the CLEF corpora. This technique already proved effective in AVE 2007.
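The diversity-controlled aggregation described above can be illustrated with a minimal sketch. This is not the actual RAVE model (the paper does not give the formula here); it is one plausible instantiation, assuming local scores behave like probabilities and novelty is measured as the fraction of a passage's terms unseen so far. All names (`novelty`, `aggregate`) are hypothetical:

```python
def novelty(terms, seen):
    """Hypothetical novelty measure: fraction of a passage's terms
    not already covered by previously aggregated passages."""
    if not terms:
        return 0.0
    return len(terms - seen) / len(terms)

def aggregate(passages):
    """Combine local validation scores for one answer.

    passages: list of (local_score, set_of_terms) pairs.
    Term-disjoint passages combine like independent evidence
    (noisy-OR); an exact repetition of a passage has novelty 0
    and thus contributes nothing, matching the robustness
    property described in the abstract.
    """
    prob_wrong = 1.0
    seen = set()
    # Process stronger passages first so weaker near-duplicates
    # are damped rather than the other way around.
    for score, terms in sorted(passages, key=lambda p: -p[0]):
        w = novelty(terms, seen)
        prob_wrong *= 1.0 - w * score  # novelty-damped contribution
        seen |= terms
    return 1.0 - prob_wrong
```

Under this sketch, two term-disjoint passages with score 0.5 aggregate to 0.75 (independent evidence), while a repeated passage leaves the score at 0.5.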
The development of RAVE is not finished yet, but the system already achieves an F-score of 0.39 and a selection rate of 0.61 relative to optimal selection. Judged against last year's runs of MAVE (a 0.93 selection rate and an F-score of 0.72), this may look disappointing. However, the AVE task for German was much more difficult this year, and RAVE's F-score gain (over the 100% "yes" baseline) and QA-accuracy gain (over random selection) are better than in last year's runs of MAVE.
Related resources
University of Hagen at QA@CLEF 2007: Coreference Resolution for Questions and Answer Merging
The German question answering (QA) system InSicht participated in QA@CLEF for the fourth time. InSicht realizes a deep QA approach: it builds on full sentence parses, rulebased inferences on semantic representations, and matching semantic representations derived from questions and document sentences. InSicht was improved for QA@CLEF 2007 in the following main areas: questions containing pronomi...
The LogAnswer Project at CLEF 2008: Towards Logic-Based Question Answering
LogAnswer is a logic-oriented question answering system jointly developed by the AI research group at the University of Koblenz-Landau and by the IICS at the University of Hagen. The system was designed to address two notorious problems of the logic-based approach: Achieving robustness and acceptable response times. The main innovation of LogAnswer is its use of logic for simultaneously extract...
University of Hagen at CLEF 2007: Answer Validation Exercise
MAVE (Multinet-based Answer VErification) is an answer validation system based on deep linguistic processing and logical inference originally developed for AVE 2006. Robustness of the entailment check is obtained by embedding the theorem prover in a constraint relaxation loop. The system can also be used for answer selection, which is then guided by the joint evidence of all available text pass...
University of Hagen at QA@CLEF 2008: Efficient Question Answering with Question Decomposition and Multiple Answer Streams
The German question answering (QA) system IRSAW (formerly: InSicht) participated in QA@CLEF for the fifth time. IRSAW was introduced in 2007, by integrating the deep answer producer InSicht, several shallow answer producers, and a logical validator. InSicht realizes a deep QA approach: it transforms documents to semantic representations using a parser, draws inferences on semantic representatio...
INAOE at QA@CLEF 2008: Evaluating Answer Validation in Spanish Question Answering
This paper introduces the new INAOE’s answer validation method. This method is based on supervised learning approach that uses a set of attributes that capture some lexical-syntactic relations among the question, the answer and the given support text. In addition, the paper describes the evaluation of the proposed method at both the Spanish Answer validation Exercise (AVE 2008) and the Spanish ...
Publication date: 2008